Optimizing Selective Search in Chess 






Omid David-Tabibi MAIL@OMIDDAVID.COM
Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel

Moshe Koppel KOPPEL@CS.BIU.AC.IL
Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel

Nathan S. Netanyahu NATHAN@CS.BIU.AC.IL
Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel, and Center for Automation
Research, University of Maryland, College Park, MD 20742



Abstract 

In this paper we introduce a novel method for 
automatically tuning the search parameters 
of a chess program using genetic algorithms. 
Our results show that a large set of parame- 
ter values can be learned automatically, such 
that the resulting performance is comparable 
with that of manually tuned parameters of 
top tournament-playing chess programs. 



1. Introduction 

Until the mid-1970s most chess programs attempted to 
perform search by mimicking the way humans think, 
i.e., by generating "plausible" moves. By using exten- 
sive chess knowledge, these programs selected at each 
node a few moves which they considered plausible, 
thereby pruning large parts of the search tree. How- 
ever, as soon as brute-force search programs like Tech 
(Gillogly, 1972) and Chess 4.x (Slate and Atkin, 
1983) managed to reach depths of 5 plies and more, 
plausible move generating programs frequently lost to 
these brute-force searchers due to their significant tac- 
tical weaknesses. Brute-force searchers rapidly domi- 
nated the computer chess field. 

The introduction of null-move pruning (Beal, 1989; 
Donninger, 1993) in the early 1990s marked the end 
of an era, as far as the domination of brute-force pro- 
grams in computer chess is concerned. Unlike other 
forward-pruning methods which had great tactical 
weaknesses, null-move pruning enabled programs to 
search more deeply with minor tactical risks.
Forward-pruning programs frequently outsearched
brute-force searchers, and started their own reign
which has continued ever since; they have won all
World Computer Chess Championships since 1992.
Deep Blue (Hamilton and Garber, 1997; Hsu, 1999)
was probably the last brute-force searcher.

Nowadays, top tournament-playing programs use a 
range of methods for adding selectivity to their search. 
The most popular methods include null-move prun- 
ing, futility pruning (Heinz, 1998), multi-cut pruning 
(Bjornsson and Marsland, 1998; Bjornsson and Mars- 
land, 2001), and selective extensions (Anantharaman, 
1991; Beal and Smith, 1995). For each of these meth- 
ods, a wide range of parameter values can be set. For 
example, different reduction values can be used for 
null-move pruning, various thresholds can be used for 
futility pruning, etc. 

For each chess program, the parameter values for 
various selective search methods are manually tuned 
through years of experiments and manual optimiza- 
tions. In this paper we introduce a novel method for 
automatically tuning the search parameters of a chess 
program using genetic algorithms (GA). 

In the following section, we briefly review the main
methods that have been used for selective search. For
each of these methods, we enumerate the parameters 
that need to be optimized. Section 3 provides a re- 
view of past attempts at automatic learning of various 
parameters in chess. In Section 4 we present our au- 
tomatic method of optimizing the parameters in ques- 
tion, which is based on the use of genetic algorithms, 
and in Section 5 we provide experimental results. Sec- 
tion 6 contains concluding remarks. 

2. Selective Search in Chess 

In this section we review several popular methods for 
selective search. All these methods work within the
alphabeta/PVS framework and introduce selectivity in
various forms. A simple alphabeta search requires the 
search tree to be developed to a fixed depth in each it- 
eration. Forward pruning methods, such as null-move 
pruning, futility pruning, and multi-cut pruning, en- 
able the program to prune some parts of the tree at 
an earlier stage, and devote the time gained to other, 
more promising parts of the search tree. 

Selective extensions, on the other hand, extend cer- 
tain parts of the tree to be searched deeper, due to 
tactical considerations associated with a position in 
question. The following subsections briefly cover each 
of these pruning and extension methods, and specify 
which parameters should be tuned for each method. 

2.1. Null-Move Pruning 

Null-move pruning (Beal, 1989; David-Tabibi and
Netanyahu, 2008b; Donninger, 1993) is based on the
assumption that "doing nothing" (i.e., making a
null-move) is never the best choice in a chess position,
even if it were a legal option. In other words, the best
move in any position has to be better than the null-move.
This assumption enables the program to establish a
lower bound $\alpha$ on the position by conducting a
null-move search. The idea is to make a null-move, i.e.,
merely swap the side whose turn it is to move. (Note
that this cannot be done in positions where the side 
to move is in check, since the resulting position would 
be illegal. Also, two null-moves in a row are forbid- 
den, since they result in nothing.) A regular search is 
then conducted with reduced depth R. The returned 
value of this search can be treated as a lower bound 
on the position's strength, since the value of the best 
(legal) move has to be better than that obtained from 
the null-move search. In a negamax framework, if the 
returned value is greater than or equal to the current
upper bound (i.e., $\mathit{value} \ge \beta$), it results in a cutoff
(fail-high). Otherwise, if the value is greater than the
current lower bound (i.e., $\alpha < \mathit{value} < \beta$), we define
a narrower search window, as the returned value be- 
comes the new lower bound. If the value is smaller 
than the current lower bound, it does not contribute 
to the search in any way. The main benefit of the 
null-move concept is the pruning obtained due to the 
cutoffs, which take place whenever the returned value 
of the null-move search is greater than or equal to the
current upper bound. Thus, the best way to apply
null-move pruning is by conducting a minimal-window
null-move search around the current upper bound
$\beta$, since such a
search will require a reduced search effort to determine 
if a cutoff takes place. 
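
The control flow described above can be sketched as follows. This is a minimal illustration rather than any engine's implementation: the `PickGame` toy game (players alternately take a number from a shared pool) and all names are assumptions introduced here so that the example is self-contained. The null-move search uses the minimal window around $\beta$ and a depth reduced by R.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

INF = 10**9


@dataclass(frozen=True)
class PickGame:
    """Toy game (hypothetical): players alternately take one number from a
    shared pool. `lead` is the score lead of the side to move."""
    pool: Tuple[int, ...]
    lead: int = 0

    def play(self, i: int) -> "PickGame":
        rest = self.pool[:i] + self.pool[i + 1:]
        # The opponent moves next, so the lead is seen from the other side.
        return PickGame(rest, -(self.lead + self.pool[i]))

    def play_null(self) -> "PickGame":
        # Null-move: only the side to move changes.
        return PickGame(self.pool, -self.lead)


def negamax(pos: PickGame, depth: int, alpha: int, beta: int,
            R: Optional[int] = None, null_ok: bool = True) -> int:
    """Fail-soft negamax; null-move pruning is enabled when R is not None."""
    if depth <= 0 or not pos.pool:
        return pos.lead  # static evaluation
    if R is not None and null_ok:
        # Minimal-window search around beta at reduced depth. Two null-moves
        # in a row are forbidden (null_ok=False); a chess engine would also
        # skip this step when the side to move is in check.
        value = -negamax(pos.play_null(), depth - 1 - R,
                         -beta, -beta + 1, R, null_ok=False)
        if value >= beta:
            return value  # fail-high cutoff
    best = -INF
    for i in range(len(pos.pool)):
        value = -negamax(pos.play(i), depth - 1, -beta, -alpha, R)
        best = max(best, value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break
    return best
```

Calling `negamax(PickGame((5, 3, 2, 1)), 4, -INF, INF)` runs a plain search, while passing `R=2` switches the null-move step on.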

Donninger (1993) was the first to suggest an adaptive
rather than a fixed value for $R$. Experiments
conducted by Heinz in his article on adaptive
null-move pruning (1999) showed that, indeed, an adaptive
rather than a fixed value could be selected for the
reduction factor. By using $R = 3$ in upper parts of the
search tree and $R = 2$ in its lower parts (close to the
leaves) pruning can be achieved at a smaller cost (as
null-move searches will be shallower in comparison to
using a fixed reduction value of $R = 2$) while
maintaining the overall tactical strength. An in-depth review
of null-move pruning and our extended null-move
reductions improvement can be found in (David-Tabibi
and Netanyahu, 2008b).

Over the years many variations of null-move pruning 
have been suggested, but the set of key parameters to 
be determined has remained the same. These param- 
eters are: (1) the reduction value R, (2) the Boolean 
adaptivity variable, and (3) the adaptivity depth for 
which the decremented value of R is applied. 
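
As an illustration of how these three parameters interact, the following sketch selects the reduction for a node. The function name and the default values are assumptions made for the example, not values taken from a particular program.

```python
def null_move_reduction(depth: int, R: int = 3, adaptive: bool = True,
                        adaptivity_depth: int = 6) -> int:
    """Return the null-move reduction to use with `depth` plies remaining.

    With adaptivity enabled, nodes at or below `adaptivity_depth` (i.e.,
    close to the leaves) use the decremented value R - 1.
    """
    if adaptive and depth <= adaptivity_depth:
        return R - 1
    return R
```

With the assumed defaults, a node with 8 plies remaining uses a reduction of 3, and a node with 4 plies remaining uses 2.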

2.2. Futility Pruning 

Futility pruning and extended futility pruning (Heinz,
1998) suggest pruning nodes near a leaf where the sum
of the current static evaluation value and some
threshold (e.g., the value of a knight) is smaller than $\alpha$. In
these positions, assuming that the value gained in the
remaining moves until reaching the leaf is not greater
than the threshold, it is safe to assume that the
position is "weak enough", i.e., that it is worth pruning
(as its score will not be greater than $\alpha$). Naturally,
the larger the threshold, the safer it is to apply futility
pruning, although fewer nodes will be pruned.

The main parameters to be set for futility pruning are: 
(1) the futility depth and (2) the futility thresholds for 
various depths (usually up to a depth of 3 plies). 
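
The pruning condition itself is a single comparison. The sketch below assumes centipawn scores and purely illustrative threshold values; a real engine would normally add further conditions (e.g., not pruning when in check), which are omitted here.

```python
# Illustrative thresholds per remaining depth, in centipawns (assumed values).
FUTILITY_THRESHOLDS = {1: 300, 2: 500, 3: 900}


def is_futile(static_eval: int, alpha: int, depth: int,
              thresholds: dict = FUTILITY_THRESHOLDS,
              futility_depth: int = 3) -> bool:
    """True if the node may be pruned: static_eval + threshold < alpha."""
    if depth < 1 or depth > futility_depth:
        return False  # futility pruning applies only near the leaves
    return static_eval + thresholds[depth] < alpha
```

For instance, with $\alpha = 0$ a position evaluated at -400 is futile at depth 1 under the assumed threshold of 300, while one evaluated at -200 is not.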

2.3. Multi-Cut Pruning 

Bjornsson and Marsland's multi-cut pruning (1998; 
2001) suggests searching the moves at a given posi- 
tion to a shallower depth first, such that if several of 
them result in a cutoff, the current node is pruned 
without conducting a full depth search. The idea is 
that if there are several moves that produce a cutoff 
at a shallower depth, there is a high likelihood that at 
least one of them will produce a cutoff if searched to 
a full depth. In order to apply multi-cut pruning only
to potentially promising nodes, it is applied only to 
cut-nodes (i.e., nodes at which a cutoff has occurred 
previously, according to a hash table indication). 

The primary parameters that should be set in multi- 
cut pruning are: (1) the depth reduction value, (2) the 
depth for which multi-cut is applied, (3) the number 
of moves to search, and (4) the number of cutoffs to 
require. 
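
The four parameters map onto a short test performed at a cut-node. In this sketch, `search` is a hypothetical callback that returns the value of a move (from the current node's point of view) when searched to the given depth; the default parameter values are assumptions for illustration.

```python
from typing import Callable, Sequence


def multi_cut_prune(moves: Sequence, search: Callable[[object, int], int],
                    depth: int, beta: int, reduction: int = 2,
                    min_depth: int = 3, move_num: int = 10,
                    cut_num: int = 3) -> bool:
    """Return True if at least `cut_num` of the first `move_num` moves
    produce a cutoff (value >= beta) in a search reduced by `reduction`."""
    if depth < min_depth:
        return False  # multi-cut is not applied at shallow depths
    cutoffs = 0
    for move in moves[:move_num]:
        if search(move, depth - 1 - reduction) >= beta:
            cutoffs += 1
            if cutoffs >= cut_num:
                return True  # prune without a full-depth search
    return False
```

The caller is expected to invoke this only at nodes that the hash table marks as cut-nodes.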

2.4. Selective Extensions

Selective extensions (Anantharaman, 1991; Beal and 
Smith, 1995) are used for extending potentially critical 
moves to be searched deeper. The following is a list of 
major extensions used in most programs: 

Check extension: Extend the move if it checks the 
opponent's king. 

One-reply extension: Extend the move if it is the 
only legal move. 

Recapture extension: Extend the move if it is a 
recapture of a piece captured by the opponent (such 
moves are usually forced). 

Passed pawn extension: Extend the move if it
involves moving a passed pawn (usually to the 7th rank).

Mate threat extension: Extend the move if the
null-move search returns a mate score (the idea is that
if doing a null-move results in being checkmated, a
potential danger lies at the horizon, so we extend the
search to find the threat).

For each of the above extensions, fractional extensions 
have been widely employed. These are implemented 
usually by defining one ply to be a number greater than 
one (e.g., 1 ply = 4 units), such that several fractional 
extensions along a line (i.e., a series of moves from the 
root to a leaf) cause a full ply extension. For example, 
if a certain extension is defined as half a ply, two such 
extensions must occur along a line in order to result in 
an actual full ply extension. For each extension type, 
a value is defined (e.g., assuming that 1 ply = 4 units,
an extension has a value between 0 and 4).
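
The bookkeeping behind fractional extensions can be sketched with integer depth units. The function names are assumptions for this example; the unit size (1 ply = 4 units) follows the text.

```python
PLY = 4  # 1 ply = 4 units, as in the text


def child_depth(depth_units: int, extension_units: int = 0) -> int:
    """Remaining depth (in units) after making a move that earns an extension."""
    return depth_units - PLY + extension_units


def extra_full_plies(extensions_along_line) -> int:
    """Whole plies gained from the fractional extensions along one line."""
    return sum(extensions_along_line) // PLY
```

A single half-ply extension (2 units) yields no additional full ply, whereas two of them along the same line yield exactly one.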

As this brief overview shows, each selective search
method has a number of parameters which have to
be set and tuned. Currently, top tournament-playing
programs use manually tuned values which take years
of trial and improvement to fine-tune. In the next
section we review the limited success of past attempts at
automatic learning of the values of these search pa- 
rameters, and in Section 4 we present our GA-based 
method for doing so. 

3. Automatic Tuning of Search Parameters

The selective search methods covered in the previ- 
ous section are employed by most of the current top 
tournament-playing chess programs. They use manu- 
ally tuned parameter values that were arrived at after 
years of experiments and manual optimizations. 

Past attempts at automatic optimization of search pa- 
rameters have resulted in limited success. Moriarty 
and Miikkulainen (1994) used neural networks for tun- 
ing the search parameters of an Othello program, but 
as they mention in their paper, their method is not 
easily applicable to more complex games such as chess. 



Temporal difference learning has been successfully
applied in backgammon and checkers (Schaeffer, Hlynka,
and Jussila, 2001; Tesauro, 1992). Although temporal
difference learning has also been applied to chess (Baxter, Tridgell,
and Weaver, 2000), the results show that after three 
days of learning, the playing strength of the program 
was only 2150 Elo, which is a very low rating for a 
chess program. Block et al. (2008) reported that using 
reinforcement learning, their chess program achieves 
a playing strength of only 2016 Elo. Veness et al.'s 
(2009) work on bootstrapping from game tree search 
improved upon previous work, but their resulting chess 
program reached a performance of between 2154 and
2338 Elo, which is still considered a very low rating
for a chess program. Kocsis and Szepesvari's (2006) 
work on universal parameter optimization in games 
based on SPSA does not provide any implementation 
for chess. 

Bjornsson and Marsland (2002) presented a method 
for automatically tuning search extensions in chess.
Given a set of test positions (for which the correct 
move is predetermined) and a set of parameters to be 
optimized (in their case, four extension parameters), 
they tune the values of the parameters using gradient- 
descent optimization. Their program processes all the 
positions and records, for each position, the number 
of nodes visited before the solution is found. The goal 
is to minimize the total node count over all the po- 
sitions. In each iteration of the optimization process, 
their method modifies each of the extension parame- 
ters by a small value, and records the total node count 
over all the positions. Thus, given N parameters to 
optimize (e.g., N = 4), their method processes in each 
iteration all the positions N times. The parameter val- 
ues are updated after each iteration, so as to minimize 
the total node count. Bjornsson and Marsland ap- 
plied their method for tuning the parameter values of 
the four search extensions: check, passed pawn, recap- 
ture, and one-reply extensions. Their results showed 
that their method optimizes fractional ply values for 
the above parameters, as the total node count for solv- 
ing the test set is decreased. 
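
The general shape of such an optimization loop can be sketched as below. This is only a generic finite-difference descent under assumed step sizes, not Bjornsson and Marsland's exact update rule, which is not spelled out here; `total_nodes` is a hypothetical callback that runs the whole test set with the given parameters and returns the total node count.

```python
from typing import Callable, List, Sequence


def tune_by_finite_differences(total_nodes: Callable[[Sequence[float]], float],
                               params: Sequence[float], delta: float = 0.25,
                               rate: float = 0.1,
                               iterations: int = 20) -> List[float]:
    """Perturb each parameter by `delta`, estimate the slope of the total
    node count, and step all parameters against that slope."""
    current = list(params)
    for _ in range(iterations):
        base = total_nodes(current)
        gradient = []
        for i in range(len(current)):
            probe = current[:]
            probe[i] += delta  # modify one parameter by a small value
            gradient.append((total_nodes(probe) - base) / delta)
        current = [p - rate * g for p, g in zip(current, gradient)]
    return current
```

Each iteration evaluates the test set once per parameter (plus one baseline run in this sketch), which is why the cost grows with the number of parameters being tuned.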

Despite the success of this gradient-descent method for 
tuning the parameter values of the above four search 
extensions, it is difficult to use it efficiently for optimiz- 
ing a considerably larger set of parameters, which con- 
sists of all the selective search parameters mentioned 
in the previous section. This difficulty is due to the 
fact that unlike the optimization of search extensions 
for which the parameter values are mostly indepen- 
dent, other search methods (e.g., multi-cut pruning) 
are prone to a high interdependency between the pa- 
rameter values, resulting in multiple local maxima in 
the search space, in which case it is more difficult to 
apply gradient-descent optimization.

In the next section we present our method for
automatically tuning all the search parameters mentioned
in the previous section by using genetic algorithms. 

4. Genetic Algorithms for Tuning of Search Parameters

In David-Tabibi et al. (2008a; 2009; 2010) we showed
that genetic algorithms (GA) can be used to efficiently
evolve the parameter values of a chess program's eval- 
uation function. Here we present a GA-based method 
for optimizing a program's search parameters. We first 
describe how the search parameters are represented as 
a chromosome, and then discuss the details of the fit- 
ness function. 

The parameters of the selective search methods which 
were covered in Section 2 can be represented as a bi- 
nary chromosome, where the number of allocated bits 
for each parameter is based on a reasonable value range 
of the parameter. Table 1 presents the chromosome 
and the range of values for each parameter (see Sec- 
tion 2 for a description of each parameter). Note that 
for search extensions fractional ply is applied, where 1 
ply = 4 units (e.g., an extension value of 2 is equivalent 
to half a ply, etc.). 



Parameter                      Value range   Bits
Null-move use                  0-1           1
Null-move reduction            0-7           3
Null-move use adaptivity       0-1           1
Null-move adaptivity depth     0-7           3
Futility depth                 0-3           2
Futility threshold depth-1     0-1023        10
Futility threshold depth-2     0-1023        10
Futility threshold depth-3     0-1023        10
Multi-cut use                  0-1           1
Multi-cut reduction            0-7           3
Multi-cut depth                0-7           3
Multi-cut move num             0-31          5
Multi-cut cut num              0-7           3
Check extension                0-4           3
One-reply extension            0-4           3
Recapture extension            0-4           3
Passed pawn extension          0-4           3
Mate threat extension          0-4           3
Total chromosome length                      70

Table 1. Chromosome representation of 18 search parameters
(length: 70 bits).
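
Decoding such a chromosome into parameter values could look like the sketch below. The field names are invented for the example, and two details are assumptions rather than statements from the text: each field is treated as a Gray-coded integer with the most significant bit first, and a decoded value above the field's range (possible for the 3-bit extension fields, whose range is 0-4) is clamped to the maximum.

```python
# (name, bits, maximum value) for each parameter, in chromosome order.
FIELDS = [
    ("null_move_use", 1, 1),
    ("null_move_reduction", 3, 7),
    ("null_move_use_adaptivity", 1, 1),
    ("null_move_adaptivity_depth", 3, 7),
    ("futility_depth", 2, 3),
    ("futility_threshold_depth_1", 10, 1023),
    ("futility_threshold_depth_2", 10, 1023),
    ("futility_threshold_depth_3", 10, 1023),
    ("multi_cut_use", 1, 1),
    ("multi_cut_reduction", 3, 7),
    ("multi_cut_depth", 3, 7),
    ("multi_cut_move_num", 5, 31),
    ("multi_cut_cut_num", 3, 7),
    ("check_extension", 3, 4),
    ("one_reply_extension", 3, 4),
    ("recapture_extension", 3, 4),
    ("passed_pawn_extension", 3, 4),
    ("mate_threat_extension", 3, 4),
]


def gray_to_int(gray: int) -> int:
    """Convert a reflected-binary Gray code to the integer it encodes."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def decode(chromosome: str) -> dict:
    """Decode a string of 70 '0'/'1' characters into a parameter dict."""
    if len(chromosome) != sum(bits for _, bits, _ in FIELDS):
        raise ValueError("unexpected chromosome length")
    params, pos = {}, 0
    for name, bits, max_value in FIELDS:
        raw = gray_to_int(int(chromosome[pos:pos + bits], 2))
        params[name] = min(raw, max_value)  # clamping is an assumption
        pos += bits
    return params
```

Gray coding means that neighbouring integer values differ in a single bit, so a one-bit mutation can move a parameter to an adjacent value.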

For the GA's fitness function we use a similar
optimization goal to the one used by Bjornsson and
Marsland (2002), namely the total node count. A set of
879 tactical test positions from the Encyclopedia of
Chess Middlegames (ECM) is used for training
purposes. Each of these test positions has a
predetermined "correct move", which the program has to find.
In each generation, each organism searches all the 879 
test positions and receives a fitness score based on its 
performance. As noted, instead of using the number of 
solved positions as a fitness score, we take the number 
of nodes the organism visits before finding the cor- 
rect move. We record this parameter for each position 
and compute the total node count for each organism 
over the 879 positions. Since the search cannot con- 
tinue endlessly for each position, a maximum limit of 
500,000 nodes per position is imposed. If the organ- 
ism does not find the correct move when reaching this 
maximum node count for the position, the search is 
stopped and the node count for the position is set to 
500,000. Naturally, the higher the maximum limit, the 
larger the number of solved positions. However, more 
time will be spent on each position and subsequently, 
the whole evolution process will take more time. 

The fitness of the organism is inversely proportional
to its total node count for all the positions. Using
this fitness value rather than the number of solved
positions has the benefit of deriving more fitness
information per position. Rather than obtaining one bit of
information (solved or not solved) for each position, a
numeric value is obtained which also measures how quickly
the position is solved. Thus, the organism is not only
"encouraged" to solve more positions, it is also rewarded
for finding quicker solutions for the already solved test
positions.
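
The fitness computation can be sketched as follows. The input format (one entry per position, `None` when the correct move was not found) and the proportionality constant of 1 are assumptions for the example; the 500,000-node cap follows the text.

```python
NODE_LIMIT = 500_000  # maximum nodes per position, as in the text


def total_node_count(nodes_to_solution) -> int:
    """Sum the per-position node counts, charging the full limit for any
    position that was not solved within NODE_LIMIT nodes (None)."""
    return sum(NODE_LIMIT if n is None else min(n, NODE_LIMIT)
               for n in nodes_to_solution)


def fitness(nodes_to_solution) -> float:
    """Fitness inversely proportional to the total node count."""
    return 1.0 / total_node_count(nodes_to_solution)
```

An organism that solves the same positions with fewer nodes therefore receives a strictly higher fitness, even though its solved count is unchanged.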

Other than the special fitness function described 
above, we use a standard GA implementation with 
Gray coded chromosomes, fitness-proportional selec- 
tion, uniform crossover, and elitism (the best organism 
is copied to the next generation). All the organisms 
are initialized with random values. The following pa- 
rameters are used for the GA: population size = 10, 
crossover rate = 0.75, mutation rate = 0.05, number 
of generations = 50. 
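
A compact sketch of such a GA loop is given below, using the listed parameter values as defaults. It is an illustration under stated assumptions rather than the implementation used in the experiments: the mutation rate is applied per bit, the fitness callback must return positive values, and a fixed random seed is used for reproducibility.

```python
import random


def evolve(fitness, n_bits=70, pop_size=10, crossover_rate=0.75,
           mutation_rate=0.05, generations=50, seed=0):
    """Return (best chromosome, best fitness recorded in each generation)."""
    rng = random.Random(seed)
    # All organisms are initialized with random values.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(c) for c in pop]
        best = pop[scores.index(max(scores))]
        history.append(max(scores))
        nxt = [best[:]]  # elitism: the best organism is copied unchanged
        while len(nxt) < pop_size:
            # Fitness-proportional selection of two parents.
            a, b = rng.choices(pop, weights=scores, k=2)
            child = a[:]
            if rng.random() < crossover_rate:
                # Uniform crossover: each bit comes from either parent.
                child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Per-bit mutation (assumed interpretation of the mutation rate).
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            nxt.append(child)
        pop = nxt
    scores = [fitness(c) for c in pop]
    return pop[scores.index(max(scores))], history
```

Because of elitism, the best fitness recorded per generation never decreases. In the setting of this paper, `fitness` would run the organism on the test positions; for a quick trial any positive function of the bit list can be substituted, e.g. `evolve(lambda c: 1 + sum(c))`.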

The next section contains the experimental results 
using the GA-based method for optimization of the 
search parameters. 

5. Experimental Results 

We used the Falcon chess engine in our experiments. 
Falcon is a grandmaster-level chess program which 
has successfully participated in three World Computer
Chess Championships. Falcon uses NegaScout/PVS
search, with null-move pruning, internal
iterative deepening, dynamic move ordering (history
+ killer heuristic), multi-cut pruning, selective
extensions (consisting of check, one-reply, mate-threat,
recapture, and passed pawn extensions), transposition
table, and futility pruning near leaf nodes.

Each organism is a copy of Falcon (i.e., has the same
evaluation function, etc.), except that its search
parameters, encoded as a 70-bit chromosome (see
Table 1), are randomly initialized rather than manually
tuned.

The results of the evolution show that the total node
count for the population average drops from 239
million nodes to 206 million nodes, and the node count
for the best organism drops from 226 million nodes to 
199 million nodes. The number of solved positions in- 
creases from 488 in the first generation to 547 in the 
50th generation. For comparison, the total node count 
for the 879 positions due to Bjornsson and Marsland's 
optimization was 229 million nodes, and the number 
of solved positions was 508 (Bjornsson and Marsland, 
2002). 

To measure the performance of the best evolved
organism (we call this organism Evol*), we compared
it against the chess program Crafty (Hyatt, Gower,
and Nelson, 1990). Crafty has successfully
participated in numerous World Computer Chess
Championships (WCCC), and is a direct descendant of Cray
Blitz, the WCCC winner of 1983 and 1986. It is
frequently used in the literature as a standard reference.

First, we let Evol*, Crafty, and the original manually
tuned Falcon process the ECM test suite with 5
seconds per position. Table 2 provides the results. As
can be seen, Evol* solves significantly more problems
than Crafty and a few more than Falcon.



Evol*    Falcon    Crafty
652      645       593

Table 2. Number of ECM positions solved by each program
(time: 5 seconds per position).

The superior performance of Evol* on the ECM test
set is not surprising, as it was evolved on this training
set. Therefore, in order to obtain an unbiased
performance comparison, we conducted a series of 300
matches between Evol* and Crafty, and between
Evol* and Falcon. In order to measure the rating
gain due to evolution, we also conducted 1,000 matches
between Evol* and 10 randomly initialized organisms
(RandOrg). Table 3 provides the results. The table
also contains the results of 300 matches between
Falcon and Crafty as a baseline.

Match               Result           W%       RD
Falcon - Crafty     173.5 - 126.5    57.8%    +55
Evol* - Crafty      178.5 - 121.5    59.5%    +67
Evol* - Falcon      152.5 - 147.5    50.8%    +6
Evol* - RandOrg     714.0 - 286.0    71.4%    +159

Table 3. Falcon vs. Crafty, and Evol* vs. Crafty,
Falcon, and randomly initialized organisms (W% is the
winning percentage, and RD is the Elo rating difference).

The results of the matches show that the evolved
parameters of Evol* perform on par with those of
Falcon, which have been manually tuned and refined for
the past eight years. Note that the performance of
Falcon is by no means a theoretical upper bound
for the performance of Evol*. The fact that the
automatically evolved program matches a program that
has been manually tuned over many years of
world-championship-level play is by itself a clear
demonstration of the capabilities achieved due to the
automatic evolution of search parameters.
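
The RD column of Table 3 can be reproduced from the match results. The sketch below assumes the standard logistic Elo model, in which an expected score $w$ corresponds to a rating difference of $400 \log_{10}(w / (1 - w))$; the function name is introduced here for illustration.

```python
import math


def elo_difference(score_a: float, score_b: float) -> float:
    """Elo rating difference implied by a match result score_a - score_b."""
    w = score_a / (score_a + score_b)  # winning percentage as a fraction
    return 400 * math.log10(w / (1 - w))
```

For example, `round(elo_difference(173.5, 126.5))` gives 55, in agreement with the Falcon - Crafty row.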

The results further show that Evol* outperforms
Crafty, not only in terms of solving more tactical test 
positions, but more importantly in its overall strength. 
These results establish that even though the search 
parameters are evolved from scratch (with randomly 
initialized chromosomes), the resulting organism out- 
performs a grandmaster-level chess program. 

6. Conclusions 

In this paper we presented a novel method for au- 
tomatically tuning the search parameters of a chess 
program. While past attempts yielded limited success 
in tuning a small number of search parameters, the 
method presented here succeeded in evolving a large 
number of parameters for several search methods, in- 
cluding complicated interdependent parameters of for- 
ward pruning search methods. 

The search parameters of the Falcon chess engine, 
which we used for our experiments, have been man- 
ually tuned over the past eight years. The fact that 
the GA manages to evolve the search parameters
automatically, such that the resulting performance is on
par with the highly refined parameters of Falcon, is
in itself remarkable.

Note that the evolved parameter sets are not nec- 
essarily the best parameter sets for every chess pro- 
gram. Undoubtedly, running the evolutionary process 
mentioned in this paper on each chess program will 
yield a different set of results which are optimized 
for the specific chess program. This is due to the 
fact that the performance of the search component 
of the program depends on other components as well, 
most importantly the evaluation function. For
example, in a previous paper on extended null-move pruning
(David-Tabibi and Netanyahu, 2008b), we discovered
that while the common reduction value for null-move
pruning is $R = 2$ or $R = 3$, a more aggressive
reduction value of adaptive $R = 3 \sim 4$ performs better for
Falcon. It is interesting to note that our GA-based
method managed to independently find that these
aggressive reduction values work better for Falcon.